RStudio has four main windows, that often have more than just one purpose. Figure 1 provides a brief description of each window. We will use all of them during our module, but the most important ones will be the console and editor pane.
Before you start working with R, you should undertake the following steps to facilitate your workflow:
Step 1. Create a folder.
Before every session, create a folder on your computer to save all your worksheets, datasets and solutions. Make sure you remember where that folder is and that you know the path to that folder.
Give your folder any name that makes it easy to remember what it contains, e.g. 'RWorkshop1' in your 'BS2280' folder.
If you use university computer, create a folder on the university network drive (H:), so that you can access your data on any PC on campus.
Step 2. Open R and set your working directory.
This step is to tell R to use the folder you created. R will save all files in there and, if you want to open a dataset, R will also look in this folder first.
Menu bar → Click Session → Set Working Directory → Choose Directory → Select your created folder in step 1
Example. Suppose I create a folder called RWorkshop1 under Teaching BS2280 on my desktop, and select folder RWorkshop 1 when set working directly, the console window will show1:
setwd("~/Desktop/Teaching/BS2280/RWorkshop1") # for Mac users
setwd("C:/Teaching/BS2280/RWorkshop1") # for Windows users
Step 3. Open an R script file before you start with your analysis.
Click on the little white sheet with the white plus over a green circle to open an R script:
R script file is just a text file, which can help us to save codes.
Step 4. Type the codes in R script file.
Type ths codes in R script and you can add comments to your commands through the usage of a hash tag, so that you have commentary about your work and you won't forget what you have done previously. The words after a hash tag will not be executed, for example
5+2 # Summation
However, by just typing these commands into the script file nothing is actually happening. If you want R to execute any of the commands in your script file you have to do one of the following:
1. Press the "Run" button, in which case R will run the command in which the mouse curser currently is
2. Press the "CTRL"+"ENTER" on the keyboard, in which case R will run the command in which the mouse curser currently is
3. Highlight several lines and press "CTRL"+"ENTER" on the keyboard, in which case R will run the commands in the highlighted lines
Example
5+2 # Summation
Type the above command in the R script and put the mouse cursor next to the command
Click Run and you will see the Console window shows the command that has been executed. Here, R has calculated 5+2 and give you the result 7.
With some practice, you will realise that using the keyboard instead of the mouse will be much more time-efficient.
Step 5. Save your work.
Different ways exist to save your work.
Save the file in .R format. The R script will contain all your commands you have used in the current session and allows you to replicate what we have done in the future.
Click the disk icon. In the new window, name the R script file and save it in the folder you created in the step 1 (e.g. RWorkshop1).
For more complex tasks that require you to reproduce your work, use R scripts. R scripts, which are saved in '.R' format, are like a written recipe of your work which allows you to reproduce your work.2 If you use R scripts, you do not have to save your workspace image. I recommend to create an R script for each of your workshops.
If you save your session as a workspace image, all your objects you
have created will ba saved in a .Rdata
file. You can load this work image file to continue working where you
have left off. You can either save it via the menu, or use the command
line:
save.image("MyRsession.Rdata")
Here, I saved the workspace image under the name 'MyRsession.Rdata'. R will also automatically create a '.history' file that is saved alongside the workspace image. You can open it with any text editor, such as Notepad, Emacs, Vim, Pico or Nano. It contains a line-by-line history of all the commands that have been executed in the console window for the associated session.
Quick exercise:
To facilitate my workflow in R,
First, I need to to save my worksheets and datasets.
Second, I need to to tell R which folder I have created to save the datasets so that R can find the datasets.
Third, I need to because it can help me to save codes.
Fourth, I need to where I can add comments to my codes and I won't forget what I have done before.
Last, I need to for future use.
Let's start with some basics and use R as a calculator. All the standard mathematical operations can be undertaken with R. Type these codes in your R script file and click Run!
5+2 # Summation
## [1] 7
5*2 # Multiplication
## [1] 10
5/2 # Division
## [1] 2.5
5^2 # Power
## [1] 25
sqrt(5) # Square Root
## [1] 2.236068
You can also store any of your results as an R object and use it for further calculations. Type these codes in your R script file and click Run! For example, let's assign the result of a simple mathematical calculation to the object x:
x <- 5*2
x
## [1] 10
X <- 3*2
X
## [1] 6
Note that x and X are different variables, so R is case-sensitive!
We assign a value to an object with
<- ('smaller than' sign followed by a
'minus'). To print3 the value of the object, we just type its
name into the command line. Let's assign the square of x to a new object
y:
y <- x^2
y
## [1] 100
The base installation of R comes already with an impressive number of
in-built functions, e.g. the sqrt() function that we have
used in the previous exercise. However, for more specialised statistical
techniques, we have to install and activate R packages that are
developed and maintained by a large number of developers of the R
community.
In R, functions are stored in packages. To use a specific function, you first need to install the relevant package and then activate it into your session. Once the package is activated, you can access and use the functions it contains.
For instance, if we intend to use the read_excel()
function for importing Excel data in R, it's essential to note that
read_excel() is stored in a package called
readxl
Therefore, to make use of read_excel(), you must first
install and activate the readxl package. Once activated,
you will have access to the "read_excel" function within the
readxl package for data importation.
Two ways to install and activate packages in R.
The easiest way to install packages is through the RStudio package window (that's the window with the yellow border in Figure 1).
Click on the Packages tab and you will see all the packages that are already installed. To activate them, tick the box next to the package name.
If the required package is not there, you can install it via the Install tab. In the newly opened window, just type in the R package name and click 'install'. To update the installed R packages click on the Update tab that is just next to the Install tab.
You can also use the command line to install, activate and update
packages. Assume you want to install the R-package
readxl:
install.packages("readxl")
After the installation we activate the package with:
library("readxl")
Key point:
Install package: just do once
Activate package: do it every time you need
This means that you have installed the
readxl package this time. In future
instances, there is no need to reinstall it. However, you will need to
activate it by typing the command library(readxl) every
time you want to use the read_excel() function stored in
it.
Quick exercise:
What function is used to install a package in R?
What function activates a package that is already on your computer?
We install packages in R
We activate packages in R
To update all installed packages, type:
update.packages
Finally, if you should ever struggle with some of R's commands, a
look into R's help-files can be very helpful. To access the help file,
you have to type into the console window '?' and then the command name.
For example, if you want to know more about the command
sqrt(), type the following:
?sqrt()
The quickest way to close R is to use the command line:
q()
Now you are ready for the first data exercise - R Workshop 1.
This is an example. You may have given your directory a different name. Besides, this is a Windows user example, Mac users will get slightly different results↩︎
If you have used Stata before, R scripts are equivalent to Stata do-files.↩︎
'Print' is used synonymously for 'display', so you don't have to worry that your printer will switch itself on!↩︎